WC3: Analyzing the Style of Metadata Annotation Among Wikipedia Articles by Using Wikipedia Category and the DBpedia Metadata Database
Author
Abstract
WC3 (Wikipedia Category Consistency Checker) is a system that supports the analysis of the metadata-annotation style of Wikipedia articles belonging to a particular Wikipedia category (a subcategory of “Categories by parameter”) by using the DBpedia metadata database. The system aims to construct an appropriate SPARQL query that represents the category and compares the retrieved results with the articles that belong to the category. In this paper, we introduce WC3 and extend the algorithm to efficiently analyze additional varieties of Wikipedia categories. We also discuss the metadata-annotation quality of Wikipedia by using WC3.
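The comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the WC3 implementation: the function names `build_category_query` and `compare_members`, the property/value pair, and the article names are all hypothetical, and the query results are supplied by hand rather than fetched from a DBpedia endpoint. The sketch only shows the general idea of expressing a category as SPARQL triple patterns and then diffing the two article sets.

```python
from typing import Dict, Set


def build_category_query(constraints: Dict[str, str]) -> str:
    """Build a SPARQL SELECT query for articles matching every property/value pair.

    `constraints` maps a property URI to a value URI (assumed representation).
    """
    patterns = "\n".join(
        f"  ?article <{prop}> <{value}> ." for prop, value in constraints.items()
    )
    return f"SELECT DISTINCT ?article WHERE {{\n{patterns}\n}}"


def compare_members(category_members: Set[str], query_results: Set[str]) -> Dict[str, Set[str]]:
    """Split articles into three groups by category membership and query match."""
    return {
        # In the category and returned by the query: annotation agrees with the category.
        "consistent": category_members & query_results,
        # In the category but not returned: metadata may be missing or styled differently.
        "in_category_only": category_members - query_results,
        # Returned by the query but not in the category: category link may be missing.
        "in_query_only": query_results - category_members,
    }


# Illustrative, made-up inputs (not real DBpedia results).
query = build_category_query(
    {"http://dbpedia.org/ontology/country": "http://dbpedia.org/resource/Japan"}
)
report = compare_members(
    category_members={"Article_A", "Article_B"},
    query_results={"Article_B", "Article_C"},
)
```

In this sketch, articles that land in `in_category_only` or `in_query_only` are the candidates a reviewer would inspect for inconsistent annotation; how WC3 itself selects the query or ranks such candidates is not specified here.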
Related resources
Multilingual Named Entity Recognition using Parallel Data and Metadata from Wikipedia
In this paper we propose a method to automatically label multi-lingual data with named entity tags. We build on prior work utilizing Wikipedia metadata and show how to effectively combine the weak annotations stemming from Wikipedia metadata with information obtained through English-foreign language parallel Wikipedia sentences. The combination is achieved using a novel semi-CRF model for forei...
DCU at WikipediaMM 2009: Document Expansion from Wikipedia Abstracts
In this paper, we describe our participation in the WikipediaMM task at CLEF 2009. Our main efforts concern the expansion of the image metadata from the Wikipedia abstracts collection DBpedia. Since the metadata is short for retrieval by query words, we decided to expand the metadata using a typical query expansion method. In our experiments, we use the Rocchio algorithm for document expansion....
Towards Using Wikipedia as a Substitute Corpus for Topic Detection and Metadata Generation in E-Learning
Metadata is crucial for reuse of Learning Resources. Only with good metadata, there is a chance that a Learning Resource can be successfully found in a repository. However, many Learning Resources are still delivered with no or little attached metadata. Automatic metadata generation is used to put things right either as assistance for the author, or as part of a repository’s retrieval functiona...
World Literature According to Wikipedia: Introduction to a DBpedia-Based Framework
Among the manifold takes on world literature, it is our goal to contribute to the discussion from a digital point of view by analyzing the representation of world literature in Wikipedia with its millions of articles in hundreds of languages. As a preliminary, we introduce and compare three different approaches to identify writers on Wikipedia using data from DBpedia, a community project with t...
WikiCat A graph-based algorithm for categorizing Wikipedia articles
In 2005 Wikipedia implemented a category system for the purposes of facilitating navigation throughout the site. Since then, it has grown to 1.5 million categories covering over four million articles. Currently, all categorization of articles on Wikipedia is done by human editors—a task which involves an enormous amount of repetitive work. This paper details the results of building an automated...